I. INTRODUCTION

When uncertainty is present, several approaches to decision making are used depending on the 
particular details of the problem being solved. If the main difficulty lies in a large number of 
possible solutions and a complex structure of the feasible region, optimization methods are usually 
used. If the number of possible solutions is relatively small and the main difficulty lies in the 
process of updating the initial information, decision theoretic methods are appropriate. In Markov 
decision processes and stochastic optimal control, additional assumptions (such as the Markovian or
Gaussian property) are made, which allow one to obtain solutions with special properties that make
it possible to handle the dynamic aspect of the problem efficiently.

Regardless of the particular solution method, however, the common theme of all such problems
is information, or, more specifically, the lack thereof. In its current state, the fundamental quantitative
theory of information is represented by Information Theory which, in spite of a number of
fruitful connections with a variety of fields, is still predominantly a theory of information transmission.
As such, it is concerned with information quantity and largely (if not entirely) oblivious to
the possible content of information, including its accuracy and relevance to any particular problem.
On the other hand, in many of its practical applications, the primary role of information lies in
its ability to influence the quality of various decisions. It is clear that the ability of information
to play this role depends critically not just on its quantity, but on its accuracy (with respect to
describing the "true state of affairs") and relevance (with respect to the particular problem). Put
slightly differently, in its typical applications, information is acquired, then (possibly) transmitted
and finally used to solve a certain problem. This typical path of information can be termed the
full information chain (see Fig. 1 for an illustration), which currently lacks [1] a basic fundamental
theory, with the sole exception of the middle (transmission) link.

This article is part of an effort to extend the classical Information Theory to a theory of the 
full information chain - including the two "end links". Since these two links appear to be logically
closely connected, the proposed extension has to take the form of a single joint theory, meaning, in
particular, that any design decisions (similar to source coding of classical Information Theory) can
only be made when both links for the particular problem are taken into account. Still, due to the 
sheer volume of this task, it would appear reasonable to approach it in steps. Correspondingly, 
a quantitative description of the information acquisition link was addressed in earlier work, where the
process of information exchange between an agent (decision maker) and an information source was
considered. This article's goal is to provide a similar treatment of the information usage link in
a general setting. To make this task a bit more specific, the third link of the information chain
is considered from the Operational Research perspective in that the main problem the agent is
assumed to be solving is taken to have the form of a typical stochastic optimization problem with
an objective in the form of an expected value.

FIG. 1. The full information chain.



A. Related work



The present work can be looked upon as an attempt to extend the classical Information Theory
to make it useful for optimization and decision making under uncertainty. The field of Information
Theory, born from Shannon's work on the theory of communications [4], has since had great success
in a number of fields. At the present time it would be impossible to attempt making any sort of
comprehensive or even representative list of references pertaining to applications of Information
Theory in communications and other fields. To mention a few more or less randomly selected
examples, one could cite applications in statistical physics [5, 6], computer vision [7], climatology
[8, 9], physiology [10] and neurophysiology [11]. The relatively new field of Generalized Information
Theory (see e.g. [13]) is concerned with problems of characterizing uncertainty in frameworks that
are more general than classical probability, such as Dempster-Shafer theory [13]. In particular, it
was shown in [14, 15] that the minimal uncertainty measure satisfying consistency requirements
is obtained by maximizing Shannon entropy over all classical probability distributions consistent
with the given (generalized) belief specification.

As was mentioned earlier, this article is part of an effort to extend the domain of Information
Theory to include information acquisition and usage processes. The former of these was previously
addressed in the classical work of Cox [16-18] on the foundations of probability and theory of
inquiry. This line of work received further development recently, resulting in a formulation of the
calculus of inquiry [19, 20] that, in particular, constructs a distributive lattice of questions dual to
the Boolean lattice of logical assertions. The definition of questions adopted in [2] corresponds to
the particular subclass of questions - the partition questions - defined in [19]. Our work
goes beyond that on the calculus of inquiry in that it introduces the concept of pseudoenergy as a
measure of source-specific difficulty of various questions to the given information source. One could
say that it develops a quantitative theory of knowledge as opposed to the theory of information.



Information Physics [21] is a relatively new branch of physical sciences that studies the role
information plays in fundamental laws of nature. This line of research goes back to the defining
work of Jaynes [5], [6] on the application of the Principle of Maximum Entropy (MaxEnt) to derive
the fundamental laws of thermodynamics. It is related to the proposed framework in that it
addresses information relevance in application to physical sciences. The main Information Physics
hypothesis is that the laws of nature are essentially the laws of inductive inference correctly applied
to respective systems. In order to correctly formulate them one needs to know the degrees of
freedom and the relevant information necessary to completely specify the system state. Recently,
this approach (in modified and extended form) was applied to derive the fundamental laws of
classical [22] and quantum [23] mechanics.



The idea of obtaining additional information to improve the quality of decisions in situations
characterized by uncertainty is obviously not new and has been pursued, for instance, in the area of
statistical decision making. Applications to innovation adoption [25], fashion decisions [26]
and vaccine composition decisions for flu immunization [27] can be mentioned in this regard. Some
authors [28], [29] introduced various models (e.g. effective information model) to account for
the actual, or effective, amount of information contained in the received observations. The common
theme of this line of work is in trying to find an optimal trade-off between the amount of
additional information obtained and the suitably measured degree of achieving the original goal.
The proposed approach differs in that it explicitly describes, and allows one to optimize over,
not just the quantity of additional information but also its content, and is based on an explicit
description of properties of information sources.

Explicit modeling of information sources that lies at the base of the proposed methodology is
similar in spirit to analyzing and using information provided by human experts. In many practically
relevant applications, the role of information sources will likely be played by human experts. In
existing research literature, the problem of optimal usage of information obtained from experts
has been addressed mostly in the form of updating the agent's beliefs given probability assessments
from multiple experts [30-33] and optimal combining of expert opinions, including experts with
incoherent and missing outputs [34]. In the framework developed in the present and related articles
[2], [3], the emphasis is on optimizing the particular type of information for the given information
source and decision making problem.
B. Outline



In Section II we summarize the necessary information about the information acquisition link
of the full information chain. Further details are given in Appendix A. In Section III we describe
maps from the parameter to the solution space and their properties that will be used later.
Section IV contains the main part of the article - a quantitative framework for the description of the
information usage link of the full information chain. In Section V we consider a simple example.
Finally, Section VI contains a conclusion and a brief discussion of future developments. Appendix B
provides some of the longer proofs, and Appendix C gives additional examples to illustrate some
concepts introduced in the main text.



II. INFORMATION ACQUISITION LINK

An agent is assumed to be interested in solving a problem. The latter is necessary to provide a
context for information relevance. While the nature of the problem can in principle be arbitrary,
it has to allow for a quantitative characterization of the solution quality, or, equivalently, loss
(compared to that achievable in the presence of complete information). To make the discussion
a bit more specific, we take the problem to be of the following general form:

$$\min_{x \in X} E_P f(\omega, x) = \min_{x \in X} \int_\Omega f(\omega, x)\, P(d\omega). \tag{1}$$

Here $X \subset D$ is the set of all feasible solutions, i.e. the set satisfying all (deterministic) constraints
that are present in the problem formulation, where $D$ is the space to which all solutions belong
(e.g. a suitable Euclidean space). $\Omega$ has the meaning of a space of possible values of input data
parameters that are not known with certainty. It is often referred to as a parameter space. $P$ is a
fixed initial probability measure on $(\Omega, \mathcal{F})$ (with a suitable sigma-algebra $\mathcal{F}$ assumed) that describes
the initial state of information available to the agent. The function $f: \Omega \times D \to \mathbb{R}$ is assumed to
be integrable on $\Omega$ for each $x \in X$. For example, in the context of stochastic optimization, $X$ is
the set of feasible first-stage solutions and $f(\omega, x)$ is the best possible objective value for the
first-stage decision $x$ when the random outcome $\omega$ is observed.

The natural form of the loss for the formulation (1) is

$$E_P f(\omega, x^*_P) - E_P f(\omega, x^*_\omega),$$

where $x^*_P$ is a solution of (1) and $x^*_\omega$ is a solution of $\min_{x \in X} f(\omega, x)$ for the given $\omega$. The agent's
ultimate goal is to minimize the loss given the available information source(s). To achieve that
goal, the agent engages in information exchange with the source. This exchange constitutes the
content of the first link of the information chain. In the course of the information exchange, the
agent poses questions and the information source provides answers. The agent is assumed to be
capable of "deciphering" the answers by mapping them to updated probability measures on $\Omega$. In
Appendix A, we present some details of the information exchange process. In particular, we briefly
describe the notions of question difficulty, answer depth and information source models introduced
in [2] and [3].

III. MAPS AND THEIR PROPERTIES 

In what follows, we make use of maps from $\Omega$ into $X$ with discrete image sets. Let $\mathcal{G}$ be the set
of all such maps. Since the image set of all maps from $\mathcal{G}$ is assumed to be discrete, any such map
$g \in \mathcal{G}$ can be uniquely described by the corresponding partition $\mathbf{C} = \{C_1, \dots, C_r\}$ of $\Omega$ and the
corresponding image set $I = \{x_1, \dots, x_r\}$ such that $g(\omega) = x_j$ for all $\omega \in C_j$. We will sometimes
write $g = (\mathbf{C}, I)$ whenever the components of a map (partition and image set) need to be made
explicit.

The following maps from the set $\mathcal{G}$ are important special cases that will be referred to later.

• Optimal ("zero loss") map $g_0$: $g_0(\omega) = x^*_\omega$, where $x^*_\omega$ is the solution of $\min_{x \in X} f(\omega, x)$. It
simply maps each scenario into the corresponding (deterministic) optimal solution.

• All-to-one maps $g_x$: $g_x(\omega) = x$ for all $\omega \in \Omega$. These map all elements of $\Omega$ into some single
element of $X$.

• For the given measure $P$ on $\Omega$, the stochastic optimal map $g_P$: $g_P(\omega) = x^*_P$, where $x^*_P$ is a
solution of (1). Obviously, it is just a special case of the all-to-one maps $g_x$.

• For the given measure $P$ and a (complete) partition $\mathbf{C} = \{C_1, \dots, C_r\}$ of $\Omega$, the map $g_{\mathbf{C},P}$:
$g_{\mathbf{C},P}(\omega) = x^*_{P_{C_j}}$ for all $\omega \in C_j$, $j = 1, \dots, r$. (Here $x^*_{P_{C_j}}$ is an optimal solution of problem (1)
with measure $P$ replaced with the conditional measure $P_{C_j}$.) In the following, we denote by
$\mathcal{C}$ the set of all maps of the form $g_{\mathbf{C},P}$ for all possible partitions $\mathbf{C}$ of $\Omega$ and will sometimes
refer to maps from the set $\mathcal{C}$ as subset-optimal maps.

Next, we define some useful functionals to be used later. 
Let $P$ be any probability measure on $\Omega$, and $x$ an arbitrary element of the solution space $X$.
We define the suboptimality of $x$ with respect to $P$ as follows:

$$S(x, P) = E_P f(\omega, x) - E_P f(\omega, x^*_P) = \int_\Omega \big(f(\omega, x) - f(\omega, x^*_P)\big)\, P(d\omega), \tag{2}$$

i.e. suboptimality of $x$ w.r.t. $P$ is the difference in objective values of problem (1) if $x$ is used
instead of the optimal solution $x^*_P$.

If $P$ is an arbitrary measure on $\Omega$ and $g \in \mathcal{G}$ is an arbitrary map from $\Omega$ into $X$, we define the
loss of $g$ with respect to $P$ as

$$L(g, P) = E_P f(\omega, g(\omega)) - E_P f(\omega, x^*_\omega) = \int_\Omega \big(f(\omega, g(\omega)) - f(\omega, x^*_\omega)\big)\, P(d\omega). \tag{3}$$

In particular, if $g = g_P$ is the stochastic optimal map corresponding to the measure $P$, the loss
$L(g_P, P)$ is the traditional expected value of perfect information (EVPI). If $g = g_0$ is the optimal
map, the loss is equal to zero for any measure $P$: $L(g_0, P) = 0$.

Finally, for any measure $P$ and map $g \in \mathcal{G}$, we define the gain of $g$ with respect to $P$ as follows:

$$B(g, P) = E_P f(\omega, x^*_P) - E_P f(\omega, g(\omega)) = \int_\Omega \big(f(\omega, x^*_P) - f(\omega, g(\omega))\big)\, P(d\omega). \tag{4}$$

The gain functional of a map $g$ measures the decrease in loss that can be achieved by the map
$g$, compared to the best all-to-one map $g_P$. In particular, the largest possible gain is attained by an
optimal map $g_0$, and for this map, the value of gain is equal to the loss of $g_P$, since any optimal
map has zero loss. It is also clear that, while suboptimality and loss are always nonnegative, gain
can take both positive and negative values. For example, the gain of any all-to-one map $g_x$ is
negative unless $x = x^*_P$ (in which case the gain vanishes).

The following lemma states an elementary but useful relationship between gain and loss for an
arbitrary map $g$ from $\Omega$ into $X$. The proof of the lemma is straightforward and therefore omitted.

Lemma 1 For any map $g \in \mathcal{G}$ and any measure $P$ on $\Omega$,

$$B(g, P) + L(g, P) = L(g_P, P),$$

where $g_P$ is the stochastic optimal map for the measure $P$.

The statement of Lemma 1 can be rewritten as $B(g, P) = L(g_P, P) - L(g, P)$ and, in fact, can
be used as a definition of the gain of an arbitrary map $g \in \mathcal{G}$: the gain is equal to the decrease of the
value of loss compared to the loss of the best all-to-one map $g_P$.
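The functionals (2)-(4) and Lemma 1 can be checked numerically on a small discrete instance. The sketch below is only an illustration under stated assumptions: the four scenarios, their probabilities, the finite grid standing in for $X$, and the cost coefficients are all hypothetical and not taken from the article; a map $g$ is encoded as an array of grid indices, one per scenario.

```python
import numpy as np

# Hypothetical toy data (not from the article): four demand scenarios,
# their probabilities, and a finite grid standing in for the feasible set X.
omegas = np.array([10.0, 40.0, 70.0, 100.0])
probs = np.array([0.1, 0.4, 0.3, 0.2])
X = np.arange(0.0, 101.0, 1.0)

def cost(omega, x, c=1.0, b=1.5, h=0.1):
    """Illustrative objective f(omega, x): ordering + shortage + holding cost."""
    return c * x + b * np.maximum(omega - x, 0.0) + h * np.maximum(x - omega, 0.0)

F = cost(omegas[:, None], X[None, :])   # F[i, j] = f(omega_i, x_j)
rows = np.arange(len(omegas))

def stochastic_optimum(p):
    """Grid index of x*_P, a minimizer of E_P f(omega, x)."""
    return int(np.argmin(p @ F))

def suboptimality(j, p):
    """S(x_j, P) as in eq. (2)."""
    return float(p @ F[:, j] - (p @ F).min())

def loss(g, p):
    """L(g, P) as in eq. (3); g[i] is the grid index of g(omega_i)."""
    return float(p @ (F[rows, g] - F.min(axis=1)))

def gain(g, p):
    """B(g, P) as in eq. (4)."""
    return float((p @ F).min() - p @ F[rows, g])

g_opt = F.argmin(axis=1)                                # optimal map g_0
g_P = np.full(len(omegas), stochastic_optimum(probs))   # stochastic optimal map g_P
g_any = np.array([20, 20, 80, 80])                      # an arbitrary map

evpi = loss(g_P, probs)
lemma1_gap = gain(g_any, probs) + loss(g_any, probs) - evpi
print(f"EVPI = {evpi:.4f}, Lemma 1 gap = {lemma1_gap:.2e}")
```

On this encoding the identity of Lemma 1 holds up to floating-point rounding for any choice of `g_any`, since the term $E_P f(\omega, g(\omega))$ cancels in the sum of gain and loss.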
Let $f: \mathcal{P} \to \mathbb{R}$ be a real-valued functional on the suitably restricted set $\mathcal{P}$ of measures on $\Omega$.
For the later developments it turns out to be convenient to introduce the following notation. Let
$\mathbf{C} = \{C_1, \dots, C_r\}$ be a partition of $\Omega$ (a question), and let $V(\mathbf{C})$ be an answer to $\mathbf{C}$ that can take
values in the set $\{s_1, \dots, s_m\}$.

We denote by $f(P_{\mathbf{C}})$ the expected value of the functional $f(\cdot)$ over the set of conditional measures
$\{P_{C_j}\}$, $j = 1, \dots, r$:

$$f(P_{\mathbf{C}}) = \sum_{j=1}^{r} P(C_j)\, f(P_{C_j}), \tag{5}$$

and by $f(P_{V(\mathbf{C})})$ the expected value of $f(\cdot)$ over the set of updated measures $\{P^k\}$, $k = 1, \dots, m$:

$$f(P_{V(\mathbf{C})}) = \sum_{k=1}^{m} \Pr(V(\mathbf{C}) = s_k)\, f(P^k) = \sum_{k=1}^{m} v_k\, f(P^k). \tag{6}$$

Then we can define suboptimality, loss and gain functionals for a given question $\mathbf{C}$ and an
answer $V(\mathbf{C})$ using the notational conventions (5) and (6) just introduced.

Namely, for an arbitrary $x \in X$, the suboptimality of solution $x$ with respect to question $\mathbf{C}$
(and initial measure $P$) is given by

$$S(x, P_{\mathbf{C}}) = \sum_{j=1}^{r} P(C_j)\, S(x, P_{C_j}), \tag{7}$$

and the suboptimality of $x$ with respect to answer $V(\mathbf{C})$ to question $\mathbf{C}$ (and initial measure $P$)
reads

$$S(x, P_{V(\mathbf{C})}) = \sum_{k=1}^{m} v_k\, S(x, P^k). \tag{8}$$

Likewise, for an arbitrary map $g \in \mathcal{G}$ and question $\mathbf{C}$, the loss and gain of $g$ with respect to $\mathbf{C}$
are given by

$$L(g, P_{\mathbf{C}}) = \sum_{j=1}^{r} P(C_j)\, L(g, P_{C_j}), \tag{9}$$

and

$$B(g, P_{\mathbf{C}}) = \sum_{j=1}^{r} P(C_j)\, B(g, P_{C_j}), \tag{10}$$

respectively.

The loss and gain functionals for a map $g \in \mathcal{G}$ with respect to answer $V(\mathbf{C})$ are defined analogously:

$$L(g, P_{V(\mathbf{C})}) = \sum_{k=1}^{m} v_k\, L(g, P^k), \tag{11}$$

and

$$B(g, P_{V(\mathbf{C})}) = \sum_{k=1}^{m} v_k\, B(g, P^k), \tag{12}$$

respectively.

The following representation for the expected loss $L(g, P)$ will be useful later.

Lemma 2 For any map $g = (\mathbf{C}, I) \in \mathcal{G}$, the expected loss $L(g, P)$ can be written as

$$L(g, P) = \sum_{j=1}^{r} P(C_j)\, L(g, P_{C_j}) = L(g, P_{\mathbf{C}}).$$

Proof: See Appendix B. □

Let $g = (\mathbf{C}, I) \in \mathcal{C}$ be a subset-optimal map. Then the EVPI for the problem (1) can be
decomposed in a convenient way.

Lemma 3 For any map $g_{\mathbf{C},P} \in \mathcal{C}$, the EVPI $L(g_P, P)$ of the problem (1) can be decomposed as

$$L(g_P, P) = S(x^*_P, P_{\mathbf{C}}) + L(g_{\mathbf{C},P}, P).$$

Proof: See Appendix B. □
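Lemmas 2 and 3 can be illustrated on a finite instance. The following sketch is not part of the original derivation: the scenario set, probabilities, cost coefficients, solution grid, and the two-cell partition are hypothetical choices made only to have concrete numbers.

```python
import numpy as np

# Hypothetical toy instance (not from the article).
omegas = np.array([10.0, 40.0, 70.0, 100.0])
probs = np.array([0.1, 0.4, 0.3, 0.2])
X = np.arange(0.0, 101.0, 1.0)          # finite grid standing in for X
F = (1.0 * X[None, :]
     + 1.5 * np.maximum(omegas[:, None] - X[None, :], 0.0)
     + 0.1 * np.maximum(X[None, :] - omegas[:, None], 0.0))   # F[i, j] = f(omega_i, x_j)
rows = np.arange(len(omegas))

partition = [[0, 1], [2, 3]]            # question C = {C_1, C_2} as scenario indices

def conditional(p, cell):
    """Conditional measure P_{C_j}: P restricted to the cell and renormalized."""
    q = np.zeros_like(p)
    q[cell] = p[cell] / p[cell].sum()
    return q

def loss(g, p):
    """L(g, P), eq. (3)."""
    return float(p @ (F[rows, g] - F.min(axis=1)))

def suboptimality(j, p):
    """S(x_j, P), eq. (2)."""
    return float(p @ F[:, j] - (p @ F).min())

j_P = int(np.argmin(probs @ F))         # grid index of x*_P
g_P = np.full(len(omegas), j_P)         # stochastic optimal map

g_CP = np.empty(len(omegas), dtype=int) # subset-optimal map g_{C,P}
for cell in partition:
    g_CP[cell] = int(np.argmin(conditional(probs, cell) @ F))

masses = [probs[cell].sum() for cell in partition]
S_xP_PC = sum(m * suboptimality(j_P, conditional(probs, cell))
              for m, cell in zip(masses, partition))          # eq. (7)
L_gCP_PC = sum(m * loss(g_CP, conditional(probs, cell))
               for m, cell in zip(masses, partition))         # eq. (9)

lemma2_gap = loss(g_CP, probs) - L_gCP_PC
lemma3_gap = loss(g_P, probs) - (S_xP_PC + loss(g_CP, probs))
print(f"Lemma 2 gap = {lemma2_gap:.2e}, Lemma 3 gap = {lemma3_gap:.2e}")
```

Both gaps vanish up to rounding: Lemma 2 is the law of total expectation over the cells, and Lemma 3 splits the EVPI into the part removed by knowing the cell and the part that remains.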

IV. INFORMATION USAGE LINK 

In this section, a quantitative framework for the description of the third link of the full information
chain is discussed. A connection to the first link is made, resulting in a formulation of the
optimal information acquisition problem.

A. Pseudoenergy-loss efficient frontier 

Let us consider the set $\mathcal{G}$ of maps from $\Omega$ into $X$. Each map $g = (\mathbf{C}(g), I(g))$ from this set
can be characterized by the corresponding loss $L(g, P)$ with respect to the original measure $P$
and the value $G(\Omega, \mathbf{C}(g), P)$ - the difficulty of the corresponding question. We will be interested
- for reasons that will become clear shortly - in finding the efficient frontier in the Euclidean
plane with coordinates $(G(\Omega, \mathbf{C}(g), P), L(g, P))$. In other words, we will be looking for the set of
Pareto-optimal maps that can be found by solving the following parametric optimization problem

$$\begin{aligned} &\underset{g \in \mathcal{G}}{\text{minimize}} && L(g, P) \\ &\text{subject to} && G(\Omega, \mathbf{C}(g), P) \le \gamma \end{aligned} \tag{13}$$

for all values of the parameter $\gamma$.

The first observation we can make is that to find the set of Pareto-optimal maps it is sufficient
to consider the set of subset-optimal maps $\mathcal{C}$, as the following proposition asserts.

Proposition 1 $\mathcal{O} \subseteq \mathcal{C}$, where $\mathcal{O}$ denotes the set of Pareto-optimal maps for problem (13).

Proof: Let $g = (\mathbf{C}, I)$ where $I = \{x_1, x_2, \dots, x_r\}$. Suppose that $g \notin \mathcal{C}$. Then there exists
at least one $C \in \mathbf{C}$ such that $g(C) \ne x^*_{P_C}$. Without loss of generality we can assume that
$C = C_1$. Consider a different map $g' = (\mathbf{C}, I')$ such that $I' = \{x^*_{P_{C_1}}, x_2, \dots, x_r\}$. Obviously,
$G(\Omega, \mathbf{C}(g'), P) = G(\Omega, \mathbf{C}(g), P)$ (since $\mathbf{C}(g') = \mathbf{C}(g)$). On the other hand,

$$L(g', P) - L(g, P) = P(C_1)\big(L(g', P_{C_1}) - L(g, P_{C_1})\big) < 0,$$

since $L(g', P_{C_1})$ takes the minimum value among all maps with the same partition $\mathbf{C}$. We thus find
that $L(g', P) < L(g, P)$, which means that $g \notin \mathcal{O}$. □

It follows from Proposition 1 that one needs to look no further than the set $\mathcal{C}$ of subset-optimal
maps. Such maps are uniquely characterized by the corresponding partition $\mathbf{C}$ only (up to simple
equivalences). Therefore the task of finding maps that belong to the set $\mathcal{C}$ is equivalent to that of
finding the corresponding partitions of the set $\Omega$.
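For a small finite parameter space, the search over partitions suggested by Proposition 1 can be done by brute force. The sketch below enumerates all partitions, evaluates the subset-optimal loss of each, and keeps the Pareto-optimal points. One caveat matters here: the question difficulty $G(\Omega, \mathbf{C}, P)$ is defined in Appendix A, which is not reproduced in this section, so the code uses the Shannon entropy of the cell probabilities purely as a stand-in. The toy scenarios, probabilities and cost coefficients are likewise hypothetical.

```python
import numpy as np

# Hypothetical toy instance (not from the article).
omegas = np.array([10.0, 40.0, 70.0, 100.0])
probs = np.array([0.1, 0.4, 0.3, 0.2])
X = np.arange(0.0, 101.0, 1.0)          # finite grid standing in for X
F = (1.0 * X[None, :]
     + 1.5 * np.maximum(omegas[:, None] - X[None, :], 0.0)
     + 0.1 * np.maximum(X[None, :] - omegas[:, None], 0.0))   # F[i, j] = f(omega_i, x_j)

def set_partitions(items):
    """Yield every partition of a list as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in set_partitions(rest):
        # put `first` into each existing block in turn ...
        for i, block in enumerate(smaller):
            yield smaller[:i] + [[first] + block] + smaller[i + 1:]
        # ... or into a new block of its own
        yield [[first]] + smaller

def subset_optimal_loss(partition):
    """L(g_{C,P}, P): each cell is assigned the solution optimal for P_{C_j}."""
    achieved = sum((probs[cell] @ F[cell]).min() for cell in partition)
    return float(achieved - probs @ F.min(axis=1))

def stand_in_difficulty(partition):
    """ASSUMPTION: Shannon entropy of the cell probabilities, a placeholder
    for the pseudoenergy G(Omega, C, P) that the article defines elsewhere."""
    masses = np.array([probs[cell].sum() for cell in partition])
    return float(-(masses * np.log(masses)).sum())

def pareto_front(points):
    """Keep points whose loss is strictly below that of every easier question."""
    front, best = [], np.inf
    for G, L, part in sorted(points, key=lambda t: (t[0], t[1])):
        if L < best - 1e-12:
            front.append((G, L, part))
            best = L
    return front

points = [(stand_in_difficulty(p), subset_optimal_loss(p), p)
          for p in set_partitions(list(range(len(omegas))))]
front = pareto_front(points)
for G, L, part in front:
    print(f"difficulty = {G:.3f}  loss = {L:.3f}  partition = {part}")
```

The enumeration grows with the Bell number of the scenario count, so this is only practical for very small instances; it is meant to make the structure of problem (13) concrete, not to be an efficient solver.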

B. Optimal information acquisition 

Let us now address the optimal information acquisition problem: what question(s) need to be
asked of the given information source in order to obtain the minimum possible loss for (1)? Given
a question $\mathbf{C} = \{C_1, \dots, C_r\}$ to an information source and its answer $V(\mathbf{C})$ taking values in the
set $\{s_1, \dots, s_m\}$, we denote by $\mathcal{L}(s_k)$, $k = 1, \dots, m$, the minimum conditional expected loss given
that $V(\mathbf{C}) = s_k$ and by $\mathcal{L}(V(\mathbf{C}))$ the minimum expected loss that the agent can achieve given the
answer $V(\mathbf{C})$. The latter can be found as

$$\mathcal{L}(V(\mathbf{C})) = \sum_{k=1}^{m} \Pr(V(\mathbf{C}) = s_k)\, \mathcal{L}(s_k), \tag{14}$$

i.e. as an expectation over possible values of the answer $V(\mathbf{C})$.

Clearly, if no answer was received - and the agent has to choose a solution $x \in X$ based on the
original information only - the minimum expected loss will be equal to the EVPI of the original
problem: $\mathcal{L}(\emptyset) = L(g_P, P)$.

If the agent poses a question $\mathbf{C} = \{C_1, \dots, C_r\}$ to the information source and receives a particular
value $s_k$ of answer $V(\mathbf{C})$, the original measure $P$ on $\Omega$ gets updated to $P^k$. Therefore, in order
to minimize loss for the given value of answer $V(\mathbf{C})$, the agent needs to choose the solution $x^*_{P^k}$
- the solution minimizing the expectation $E_{P^k} f(\omega, x)$ over all (feasible) values of $x$.

1. Perfect answers 

First, let us assume that the information source can provide a perfect answer to $\mathbf{C}$. Then the
following result can be obtained.

Proposition 2 Let $\mathbf{C} = \{C_1, \dots, C_r\}$ be a complete question and $g_{\mathbf{C},P} \in \mathcal{C}$ be a corresponding
subset-optimal map. If the agent is given a perfect answer $V^*(\mathbf{C})$ to $\mathbf{C}$ then

$$\mathcal{L}(V^*(\mathbf{C})) = L(g_{\mathbf{C},P}, P).$$

Proof: See Appendix B. □

Combining the result of Proposition 2 with Lemma 2 (valid for any $g \in \mathcal{G}$) and Lemma 3 (valid
for any $g \in \mathcal{C}$) we can find the value of the largest loss reduction due to a perfect answer to question
$\mathbf{C}$. The result is formulated as a corollary.

Corollary 1 Given a perfect answer to question $\mathbf{C}$, the largest possible reduction in expected loss
an agent can achieve is equal to

$$\mathcal{L}(\emptyset) - \mathcal{L}(V^*(\mathbf{C})) = B(g_{\mathbf{C},P}, P) = S(x^*_P, P_{\mathbf{C}}),$$

where $g_{\mathbf{C},P} \in \mathcal{C}$ is a subset-optimal map corresponding to question $\mathbf{C}$.

2. Imperfect answers 

Now, let us relax the assumption of availability of a perfect answer to question $\mathbf{C}$. Instead,
we assume that the agent can obtain an answer $V(\mathbf{C})$ which is in general imperfect. First, we
formulate a useful auxiliary result.

Lemma 4 Let $V(\mathbf{C})$ be an answer to question $\mathbf{C}$ and let $g_{\mathbf{C},P} \in \mathcal{C}$ be a corresponding
subset-optimal map. Then

$$S(x^*_P, P_{\mathbf{C}}) = S(x^*_P, P_{V(\mathbf{C})}) + B(g_{\mathbf{C},P}, P_{V(\mathbf{C})}).$$

Proof: See Appendix B. □

Combining the result of Lemma 4 with that of Lemma 3 we obtain a useful decomposition of
the EVPI of the original problem which we formulate as a corollary.

Corollary 2 Let $V(\mathbf{C})$ be an answer to question $\mathbf{C}$ and $g_{\mathbf{C},P} \in \mathcal{C}$ a corresponding subset-optimal
map. Then

$$L(g_P, P) = S(x^*_P, P_{V(\mathbf{C})}) + B(g_{\mathbf{C},P}, P_{V(\mathbf{C})}) + L(g_{\mathbf{C},P}, P).$$

Now we can determine the minimum expected loss $\mathcal{L}(V(\mathbf{C}))$ that is obtainable with the help of
an answer $V(\mathbf{C})$ to question $\mathbf{C}$. We state the result as a proposition.

Proposition 3 Let $\mathbf{C} = \{C_1, \dots, C_r\}$ be a complete question and $g_{\mathbf{C},P} \in \mathcal{C}$ be a corresponding
subset-optimal map. If the agent is given a (generally imperfect) answer $V(\mathbf{C})$ to $\mathbf{C}$ then

$$\mathcal{L}(V(\mathbf{C})) = B(g_{\mathbf{C},P}, P_{V(\mathbf{C})}) + L(g_{\mathbf{C},P}, P).$$

Proof: See Appendix B. □

It is easy to see that, for a perfect answer $V^*(\mathbf{C})$ to question $\mathbf{C}$, the gain $B(g_{\mathbf{C},P}, P_{V(\mathbf{C})})$ in
Proposition 3 vanishes (since $B(g_{\mathbf{C},P}, P_{V^*(\mathbf{C})}) = B(g_{\mathbf{C},P}, P_{\mathbf{C}}) = 0$) and the result of Proposition 2
is recovered.

The amount of maximum reduction of loss due to answer $V(\mathbf{C})$ to question $\mathbf{C}$ can be obtained
by combining the result of Proposition 3 with that of Corollary 2. The result is formulated as a
corollary.

Corollary 3 Given a (generally imperfect) answer to question $\mathbf{C}$, the largest possible reduction in
expected loss an agent can achieve is equal to

$$\mathcal{L}(\emptyset) - \mathcal{L}(V(\mathbf{C})) = S(x^*_P, P_{V(\mathbf{C})}).$$
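Proposition 3 and Corollary 3 can be checked numerically once a concrete answer model is fixed. The sketch below is one possible illustration, not the article's source model: it assumes that the answer depends on $\omega$ only through the cell $C_j$ containing it, with hypothetical probabilities $\Pr(V = s_k \mid \omega \in C_j)$ collected in the matrix `channel`, and that the updated measures $P^k$ are obtained by Bayes' rule. The toy scenarios, probabilities and costs are also hypothetical.

```python
import numpy as np

# Hypothetical toy instance (not from the article).
omegas = np.array([10.0, 40.0, 70.0, 100.0])
probs = np.array([0.1, 0.4, 0.3, 0.2])
X = np.arange(0.0, 101.0, 1.0)          # finite grid standing in for X
F = (1.0 * X[None, :]
     + 1.5 * np.maximum(omegas[:, None] - X[None, :], 0.0)
     + 0.1 * np.maximum(X[None, :] - omegas[:, None], 0.0))   # F[i, j] = f(omega_i, x_j)
n = len(omegas)
rows = np.arange(n)

cell_of = np.array([0, 0, 1, 1])        # scenario -> cell index, C = {C_1, C_2}
# ASSUMED answer model: channel[j, k] = Pr(V = s_k | omega in C_j).
channel = np.array([[0.9, 0.1],
                    [0.2, 0.8]])

def loss(g, p):
    """L(g, P), eq. (3)."""
    return float(p @ (F[rows, g] - F.min(axis=1)))

def gain(g, p):
    """B(g, P), eq. (4)."""
    return float((p @ F).min() - p @ F[rows, g])

def suboptimality(j, p):
    """S(x_j, P), eq. (2)."""
    return float(p @ F[:, j] - (p @ F).min())

m = channel.shape[1]
v = np.array([probs @ channel[cell_of, k] for k in range(m)])         # v_k = Pr(V = s_k)
updated = [probs * channel[cell_of, k] / v[k] for k in range(m)]      # P^k by Bayes' rule

j_P = int(np.argmin(probs @ F))
g_P = np.full(n, j_P)
g_CP = np.empty(n, dtype=int)           # subset-optimal map g_{C,P}
for j in range(channel.shape[0]):
    cell = np.flatnonzero(cell_of == j)
    g_CP[cell] = int(np.argmin(probs[cell] @ F[cell]))

# Minimum expected loss given the answer, eq. (14): after observing s_k the
# agent switches to the solution that is optimal for P^k.
L_V = sum(v[k] * loss(np.full(n, int(np.argmin(updated[k] @ F))), updated[k])
          for k in range(m))
L_empty = loss(g_P, probs)                                            # EVPI
S_V = sum(v[k] * suboptimality(j_P, updated[k]) for k in range(m))    # eq. (8)
B_V = sum(v[k] * gain(g_CP, updated[k]) for k in range(m))            # eq. (12)

prop3_gap = L_V - (B_V + loss(g_CP, probs))
cor3_gap = (L_empty - L_V) - S_V
print(f"L(no answer) = {L_empty:.4f}, L(V) = {L_V:.4f}, "
      f"Prop. 3 gap = {prop3_gap:.2e}, Cor. 3 gap = {cor3_gap:.2e}")
```

Replacing `channel` with the identity matrix reproduces the perfect-answer case: the gain term `B_V` then vanishes and the loss given the answer reduces to the loss of the subset-optimal map, as in Proposition 2.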
C. Pseudoenergy-loss correspondence

Comparing results obtained in this section with the corresponding pseudoenergy values discussed
in Section II we can make several interesting observations regarding their correspondence that reveal
a rather clear picture. We assume that the measure $P$ admits existence of a finest partition of $\Omega$.
Let $\mathbf{C}_f(P)$ be such a finest partition. We can then summarize the observations made in the previous
sections as follows.

• The initial loss is equal to the EVPI $L(g_P, P)$. In order to reduce it to zero, one needs to
completely resolve the underlying uncertainty by answering the exhaustive question $\mathbf{C}_f(P)$ about
possible outcomes on $\Omega$ perfectly. The required pseudoenergy is equal to $G(\Omega, \mathbf{C}_f(P), P)$.

• A perfect answer to question $\mathbf{C}$ (that, as a partition, is some coarsening of $\mathbf{C}_f(P)$) requires
$G(\Omega, \mathbf{C}, P)$ worth of pseudoenergy from an information source and allows the agent to reduce
the loss by the amount equal to $S(x^*_P, P_{\mathbf{C}}) = B(g_{\mathbf{C},P}, P)$.

• If the source is able to produce only an imperfect answer $V(\mathbf{C})$ to question $\mathbf{C}$, the corresponding
amount of pseudoenergy is equal to the answer depth $Y(\Omega, \mathbf{C}, P, V(\mathbf{C}))$. Such an
answer can reduce the initial loss $L(g_P, P)$ by the amount of $S(x^*_P, P_{V(\mathbf{C})})$.

• The difference of depths (pseudoenergy contents) between a perfect and an imperfect answer
to question $\mathbf{C}$ is equal to $G(\Omega, \mathbf{C}, P_{V(\mathbf{C})})$. The corresponding difference in loss reductions
(values of information) is $B(g_{\mathbf{C},P}, P_{V(\mathbf{C})})$. The latter quantity can be naturally interpreted
as a price the agent pays for imperfection of the answer he/she receives to question $\mathbf{C}$.

• Given a perfect answer to question $\mathbf{C}$, the residual pseudoenergy measuring the degree of
difficulty of resolving the remaining uncertainty is equal to $G(\Omega, \mathbf{C}_f(P)_{\mathbf{C}}, P)$. The corresponding
residual loss is simply $L(g_{\mathbf{C},P}, P)$.

• Given an imperfect answer to question $\mathbf{C}$, the residual pseudoenergy measuring the degree
of difficulty of resolving the remaining uncertainty is equal to $G(\Omega, \mathbf{C}_f(P), P_{V(\mathbf{C})})$ - the
difficulty of the exhaustive question $\mathbf{C}_f(P)$ given the answer $V(\mathbf{C})$ to question $\mathbf{C}$. The
corresponding residual loss is equal to $\sum_{k=1}^{m} v_k L(g_{P^k}, P^k)$.

Table I shows the correspondence between pseudoenergy and loss related quantities discussed
above. We see that for every loss related quantity there is a corresponding pseudoenergy quantity,
meaning that in order to reduce the loss by a certain amount the corresponding pseudoenergy has
to be made available in the form of an answer to some question. Depending on the structure of
the question, the amount of loss reduction and, respectively, the amount of residual loss can vary
in size. The goal of the agent is to find the specific question(s) that would maximize the effect
of the given information source (characterized by its pseudoenergy functional and source model
parameters such as capacity) on the given problem. More specifically, the agent would want to find
the specific question $\mathbf{C}$ that would result in the smallest possible minimum expected loss $\mathcal{L}(V(\mathbf{C}))$
where $V(\mathbf{C})$ is the answer that the source can provide to question $\mathbf{C}$. Formally, this information
acquisition optimization problem can be written as

$$\begin{aligned} &\text{minimize} && \mathcal{L}(V(\mathbf{C})) \\ &\text{subject to} && Y(\Omega, \mathbf{C}, P, V(\mathbf{C})) = h(G(\Omega, \mathbf{C}, P)) \end{aligned} \tag{15}$$

where minimization is performed over all possible partitions of the parameter space $\Omega$. The
expression for the minimum loss $\mathcal{L}(V(\mathbf{C}))$ is given either by Proposition 2 (for perfect answers) or
Proposition 3 (for imperfect answers).

If a source is capable of perfect answers (for instance, in the simple linear model), solution
of problem (15) reduces to finding the efficient frontier: if $L^*(G)$ is the expression describing
the efficient frontier (abstracting from its true discrete structure) and $Y_s$ is the capacity of the
information source, then the minimum in (15) is equal to $L^*(Y_s)$ and is achieved by the question
$\mathbf{C}$ lying on the efficient frontier such that $G(\Omega, \mathbf{C}, P) = Y_s$.

If a source cannot provide perfect answers (likely a more realistic scenario), one would need
to consider questions with difficulty exceeding the source capacity ($G(\Omega, \mathbf{C}, P) > Y_s$) in order to
minimize the expected loss. The search for an optimal question in this case becomes somewhat
more complicated as the error structure for the source's answers needs to be taken into account.
If answers are assumed, for instance, to be quasi-perfect, optimal question(s) can be readily found
approximately provided the efficient frontier is already known. An illustration is provided in the
next section.

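The correspondence between pseudoenergy and loss quantities shown in Table I can be illustrated
by comparing decompositions of the exhaustive question difficulty $G(\Omega, \mathbf{C}_f(P), P)$ (expression (16)),
on the one hand, and the EVPI $L(g_P, P)$ (expression (17)), on the other hand. It is also shown in Fig. 2.

$$Y(\Omega, \mathbf{C}, P, V(\mathbf{C})) + G(\Omega, \mathbf{C}, P_{V(\mathbf{C})}) + G(\Omega, \mathbf{C}_f(P)_{\mathbf{C}}, P) = G(\Omega, \mathbf{C}_f(P), P), \tag{16}$$

where the first two terms on the left-hand side sum to $G(\Omega, \mathbf{C}, P)$ and the last two terms sum to
$G(\Omega, \mathbf{C}_f(P), P_{V(\mathbf{C})})$. Likewise,

$$S(x^*_P, P_{V(\mathbf{C})}) + B(g_{\mathbf{C},P}, P_{V(\mathbf{C})}) + L(g_{\mathbf{C},P}, P) = L(g_P, P), \tag{17}$$

where the first two terms on the left-hand side sum to $S(x^*_P, P_{\mathbf{C}}) = B(g_{\mathbf{C},P}, P)$ and the last two
terms sum to $\sum_{k=1}^{m} v_k L(g_{P^k}, P^k)$.

TABLE I. Correspondence between pseudoenergy and loss related quantities.

| Pseudoenergy quantity | Loss quantity | Interpretation |
| $Y(\Omega, \mathbf{C}, P, V(\mathbf{C}))$ | $S(x^*_P, P_{V(\mathbf{C})})$ | answer depth / loss reduction due to that answer |
| $G(\Omega, \mathbf{C}, P_{V(\mathbf{C})})$ | $B(g_{\mathbf{C},P}, P_{V(\mathbf{C})})$ | residual difficulty / "price" of answer imperfection |
| $G(\Omega, \mathbf{C}_f(P)_{\mathbf{C}}, P)$ | $L(g_{\mathbf{C},P}, P)$ | residual pseudoenergy / loss given perfect answer to $\mathbf{C}$ |
| $G(\Omega, \mathbf{C}_f(P), P_{V(\mathbf{C})})$ | $\sum_{k=1}^{m} v_k L(g_{P^k}, P^k)$ | residual pseudoenergy / loss given an imperfect answer to $\mathbf{C}$ |

FIG. 2. The efficient frontier and correspondence between pseudoenergy and objective function (loss)
quantities. A Pareto-optimal map $g$ on the efficient frontier is shown. Axes: loss $L(g, P)$ versus
question difficulty $G(\Omega, \mathbf{C}, P)$.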

V. EXAMPLE 

Suppose a company has to decide on the order quantity $x$ of a certain product and is required
to satisfy an uncertain demand $\omega$. The cost of ordering is $c > 0$ per unit of product. If the demand
is larger than the ordered quantity, the shortage has to be covered by back ordering at a higher
cost $b > c$. If the demand turns out to be lower than the ordered quantity, the extra units are held
in storage at unit cost of $h > 0$. Thus the total cost has the form

$$f(\omega, x) = cx + b[\omega - x]_+ + h[x - \omega]_+, \tag{18}$$

where $[y]_+ = \max\{y, 0\}$ for any real $y$. We assume that both $x$ and $\omega$ are continuous variables, for
convenience. It is well known that if the measure on the parameter space is described by a cdf
$F(\cdot)$ then the optimal solution of the problem

$$\min_x E_P f(\omega, x) \tag{19}$$

is given by $x^*_P = F^{-1}\left(\frac{b-c}{b+h}\right)$.

Let us assume that the probability measure $P$ is uniform on $\Omega = [0, a]$. Then, clearly, $x^*_P = a\frac{b-c}{b+h}$
(and therefore $g_P(\omega) = a\frac{b-c}{b+h}$ for all $\omega \in \Omega$). Consider partitions of $\Omega$ such that $P(C_j) = w_j$,
$j = 1, \dots, r$, and all sets $C_j$ are connected. Just like in the previous example, we can assume,
without loss of generality, that $C_j = [aW_j, a(W_j + w_j)]$, where $W_j = \sum_{i=1}^{j-1} w_i$ if $j > 1$ and $W_1 = 0$.
It is straightforward to show that the EVPI of this problem is

$$L(g_P, P) = \frac{a}{2}\,\frac{(b-c)(c+h)}{b+h}$$

and, for the partition $\mathbf{C} = \{C_1, \dots, C_r\}$, $x^*_{P_{C_j}} = a\left(W_j + w_j\frac{b-c}{b+h}\right)$, and

$$L(g_{\mathbf{C},P}, P) = L(g_{\mathbf{C},P}, P_{\mathbf{C}}) = \sum_{j=1}^{r} P(C_j)\, L(g_{\mathbf{C},P}, P_{C_j}) = \sum_{j=1}^{r} \frac{a w_j^2}{2}\,\frac{(b-c)(c+h)}{b+h} = \frac{a}{2}\,\frac{(b-c)(c+h)}{b+h} \sum_{j=1}^{r} w_j^2.$$

Fig. 3 shows the efficient frontier for the case of a constant pseudotemperature function, which leads
to $u(C_j) = 1$ for $j = 1, \dots, r$, and for the case of a linearly increasing pseudotemperature function
$u(\omega) = \frac{2}{a}\omega$, which leads to $u(C_j) = 2W_j + w_j$, $j = 1, \dots, r$.

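FIG. 3. Efficient frontier for the inventory example: constant pseudotemperature case (dotted line) and
linear increasing pseudotemperature case (solid line).

Let us now consider quasi-perfect answers $V_\alpha(\mathbf{C})$ to question $\mathbf{C}$ with partitions $\mathbf{C}$ as described
before. Consider the case $r = 2$ only, for simplicity. Then $C_1 = [0, w_1 a]$ and $C_2 = [w_1 a, a]$. The
optimal solutions to (19) with the original measure $P$ replaced with $P^k$ can be shown to be

$$ x^*_{P^1} = \begin{cases} \dfrac{a w_1}{1 - \alpha w_2}\,\dfrac{b-c}{b+h} & \text{if } \alpha < \dfrac{1}{w_2}\,\dfrac{c+h}{b+h}, \\[2ex] a\left(1 - \dfrac{1}{\alpha}\,\dfrac{c+h}{b+h}\right) & \text{if } \alpha \ge \dfrac{1}{w_2}\,\dfrac{c+h}{b+h}, \end{cases} $$

and

$$ x^*_{P^2} = \begin{cases} a\left(1 - \dfrac{w_2}{1 - \alpha w_1}\,\dfrac{c+h}{b+h}\right) & \text{if } \alpha < \dfrac{1}{w_1}\,\dfrac{b-c}{b+h}, \\[2ex] \dfrac{a}{\alpha}\,\dfrac{b-c}{b+h} & \text{if } \alpha \ge \dfrac{1}{w_1}\,\dfrac{b-c}{b+h}. \end{cases} \qquad (21) $$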
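The piecewise solutions can be cross-checked against a direct numerical solution of $F_k(x) = \frac{b-c}{b+h}$, where $F_k$ is the cdf of the updated measure $P^k = \alpha P + (1-\alpha)P_{C_k}$ from (A9). The sketch below assumes $r = 2$, $0 < w_1 < 1$ and the uniform original measure; the function names are illustrative, and the closed-form branches are obtained by inverting $F_k$ piecewise.

```python
def cdf_Pk(x, k, w1, alpha, a):
    """cdf of P^k = alpha * P + (1 - alpha) * P_{C_k}, with P uniform on [0, a]."""
    lo, hi = (0.0, w1 * a) if k == 1 else (w1 * a, a)
    conditional = min(max((x - lo) / (hi - lo), 0.0), 1.0)
    return alpha * x / a + (1 - alpha) * conditional


def x_star_numeric(k, w1, alpha, a, b, c, h, tol=1e-10):
    """Solve F_k(x) = (b - c) / (b + h) by bisection on [0, a]."""
    q = (b - c) / (b + h)
    lo, hi = 0.0, a
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf_Pk(mid, k, w1, alpha, a) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def x_star_closed(k, w1, alpha, a, b, c, h):
    """Piecewise closed form obtained by inverting F_k on each branch."""
    q = (b - c) / (b + h)  # note that 1 - q = (c + h) / (b + h)
    w2 = 1 - w1
    if k == 1:
        if alpha * w2 < 1 - q:
            return a * w1 * q / (1 - alpha * w2)
        return a * (1 - (1 - q) / alpha)
    if alpha * w1 < q:
        return a * (1 - w2 * (1 - q) / (1 - alpha * w1))
    return a * q / alpha


params = dict(a=100.0, b=1.5, c=1.0, h=0.1)
for w1, alpha in [(0.25, 0.56), (0.9, 0.9), (0.1, 0.9), (0.5, 0.0)]:
    for k in (1, 2):
        print(k, w1, alpha,
              x_star_closed(k, w1, alpha, **params),
              x_star_numeric(k, w1, alpha, **params))
```

The test points are chosen so that both branches of each formula are exercised.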



The suboptimalities $S(x^*_P, P^k)$ for $k = 1, 2$ can then be calculated. The resulting expressions are
too lengthy (and not very illuminating) to be given here. The resulting loss can be found as

$$ B(g_{\mathbf{C},P}, P_{V(\mathbf{C})}) + L(g_{\mathbf{C},P}, P) = L(g_P, P) - S(x^*_P, P_{V(\mathbf{C})}), \qquad (22) $$

and the pseudoenergy content of answer $V_\alpha(\mathbf{C})$ is simply $Y(\Omega, \mathbf{C}, P, V_\alpha(\mathbf{C}))$ given by (A10). Let us
set, for definiteness, $c = 1$, $b = 1.5$, $h = 0.1$ and $a = 100$. Then the EVPI of the original problem is
$L(g_P, P) = 17.19$. Let us also consider two information sources, described by the modified linear
model, with equal capacity of $Y_s = 0.2$ (in the average unit pseudotemperature calibration) and
the same value of the model parameter $b = 0.8$ (the slope in (A12), not to be confused with the
back-ordering cost $b$). The first source is characterized by a constant pseudotemperature
function $u(\omega) = 1$ and the second has linear increasing pseudotemperature $u(\omega) = \frac{2}{a}\omega$. The second
source can be said to have relatively more "knowledge" about lower values of possible demand.

We are interested in finding, for each source, an $r = 2$ question $\mathbf{C} = \{C_1, C_2\}$ an answer to which
would help the agent minimize the expected loss. This can easily be done numerically, for example,
by graphing the loss (22) against the answer depth $Y(\Omega, \mathbf{C}, P, V_\alpha(\mathbf{C}))$, for different questions $\mathbf{C}$
(in this case, uniquely characterized by a single parameter $w_1$). It turns out (see Fig. [4] for an
illustration) that the minimum loss at $Y(\Omega, \mathbf{C}, P, V_\alpha(\mathbf{C})) = Y_s = 0.2$ is achieved for $w_1 = 0.25$ for
the first source and $w_1 = 0.21$ for the second source. The minimum loss itself turns out to be equal
to $\mathcal{L}(V(\mathbf{C})) = 15.48$ for the first source and $\mathcal{L}(V(\mathbf{C})) = 13.27$ for the second source, representing,
respectively, 10% and 23% loss reduction from the original EVPI of 17.19. Clearly, the reason the
second source is able to help the agent significantly more is that the agent is capable of utilizing
the particular "expertise" of the second source by asking a question that is easy for the source and
thus can be answered relatively well (with error probability $\alpha = 0.21$). On the other hand, the first
source answers its "best" question with error probability of $\alpha = 0.56$ which results - expectedly -
in a lower loss reduction. Note that the difficulty of the optimal question is equal to 0.80 for the
first source and 0.41 for the second source, while the depth of the respective answer is equal to 0.2
(the source's capacity) in both cases. Note also that, in the modified linear model, a source can
provide an answer of depth equal to capacity $Y_s$ whenever the question difficulty exceeds the value
$Y_s/b$, i.e. the question has to be sufficiently difficult for the source so that the latter can provide
an answer of maximum depth.

FIG. 4. Loss vs. pseudoenergy (for $r = 2$ questions only) for a source with constant pseudotemperature
(left) and a source with linear increasing pseudotemperature (right). On both plots, the solid line is obtained
by varying the parameter $w_1$ from 0 to 1. The dashed line is obtained by fixing a value of $w_1$ and varying $\alpha$
from 0 to 1. The value of $w_1$ (characterizing the optimal question) is chosen so that the point of intersection
of the dashed line and the vertical dotted line (source capacity) has the lowest possible value of the vertical
coordinate. The latter is equal to the minimum expected loss $\mathcal{L}(V(\mathbf{C}))$.






VI. CONCLUSION 



Despite the role information plays in science and engineering, the fundamental theory of in- 
formation itself is still largely limited to just the middle link of the full information chain that 
generally includes information acquisition, transmission and usage stages (links). The theory of 
the middle link - the classical Information Theory - describes information transmission and can 
be concisely characterized as a theory of information quantity. If a description of the end links of 
the information chain is desired, a theory of information accuracy and relevance is required. 

This article is devoted to the development of the basic framework of a theory of the information
usage link. Since the two end links are closely logically connected, they have to be treated together,
and earlier results on the basics of the information acquisition link are used here to arrive
at the formulation of the optimal information acquisition problem which, in its elementary form,
searches for an optimal question to the given information source needed to maximize the solution
quality (understood as loss reduction) for the given (optimization) problem. Such a question can
be thought of as a way of achieving an optimal "alignment" between the information source (the
first link) and the problem (the third link), for the given state of "information background" - the
initial probability measure.

Solving the optimal information acquisition problem is facilitated by consideration of the 
Pseudoenergy-Loss efficient frontier in the space of all possible questions. The latter consists of all 
questions that are the most relevant for the given problem among all that are at most as difficult 
for the given source. The knowledge of the efficient frontier enables the agent to (approximately)
find optimal questions for a source with a known knowledge structure (described by the question
difficulty functional) and pseudoenergy capacity, which can be interpreted as the source's
maximum knowledge depth. It is interesting to observe that the two end links of the information
chain exhibit a notable symmetry, with pseudoenergy (accuracy) and loss (relevance) quantities 
coming in corresponding pairs. One can talk of a duality between the two links. This duality 
appears to be a manifestation of the tight interconnection between the end links. 

Finally, the problem of finding the efficient frontier of questions appears to be a computationally
difficult one. Fortunately, it turns out that methods based on probability metrics which were used
in scenario reduction approaches to stochastic optimization can also be of use for approximate
efficient frontier determination. This is the main subject of the companion paper.

Appendix A: Questions, Answers and Source Models

1. Questions and their difficulty



A definition of questions was originally given by Cox in [18]. There, a question was associated
with the set of all logical assertions that answer it fully. This line was further developed in subsequent work, where
a distributive lattice of questions was constructed from the lattice of logical assertions, questions
being associated with down-sets of subsets of elements of the latter lattice. In the context of
our discussion, a question so defined would be associated with an inclusion-free [36] collection of
subsets of $\Omega$. Moreover, real questions of [18] are associated with inclusion-free collections
of subsets of $\Omega$ that cover the whole of $\Omega$, and partition questions (which will be of primary interest
to us) correspond to inclusion-free collections that are partitions of $\Omega$, i.e. do not include overlapping
subsets. Besides standard (complete) partitions, we also make use of incomplete partitions, i.e.
collections of non-overlapping subsets of $\Omega$ that do not fully cover $\Omega$. We call questions associated
with them incomplete questions. Additionally, if such a partition consists of a single subset of $\Omega$,
the corresponding question is called, following [3], an ideal question.

A difficulty functional $G(\Omega, \mathbf{C}, P)$ can be associated with any question $\mathbf{C} = \{C_1, \dots, C_r\}$. The
particular form of $G(\Omega, \mathbf{C}, P)$ can be determined if some reasonable requirements are imposed. This
was done in [2], where a particular system of postulates expressing linearity and isotropy properties
of the difficulty functional was proposed. The main theorem proved in [2] derives the general form
of the difficulty functional that is required to satisfy such postulates.

Theorem 1 Let the functional $G(\Omega, \mathbf{C}, P)$ where $\mathbf{C} = \{C_1, \dots, C_r\}$ satisfy Postulates 1 through 6
(see [2]). Then it has the form

$$ G(\Omega, \mathbf{C}, P) = \sum_{j=1}^{r} u(C_j) P(C_j) \log \frac{1}{P(C_j)}, $$

where $u(C_j) = \frac{\int_{C_j} u(\omega)\, dP(\omega)}{P(C_j)}$ and $u: \Omega \to \mathbb{R}$ is an integrable nonnegative function on the parameter
space $\Omega$.

In particular, the difficulty of the given question $\mathbf{C}$ depends on, besides the initial probability
measure $P$, the function $u(\cdot)$ defined on the parameter space $\Omega$. This function may be called
the pseudotemperature using parallels with thermodynamics. The question difficulty then can be
interpreted as the amount of pseudoenergy associated with question $\mathbf{C}$.
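As an illustration, the difficulty functional of Theorem 1 can be evaluated for partitions of $[0, a]$ into contiguous intervals. This is a minimal sketch: the logarithm base (2, so that constant unit pseudotemperature gives entropy in bits) is an assumption, and the function names are illustrative. The subset averages $u(C_j) = 2W_j + w_j$ are those quoted in the inventory example for $u(\omega) = \frac{2}{a}\omega$ under the uniform measure.

```python
import math


def difficulty(probs, temps, base=2.0):
    """G = sum_j u(C_j) P(C_j) log(1 / P(C_j)); empty subsets contribute zero."""
    return sum(u * p * math.log(1.0 / p, base)
               for p, u in zip(probs, temps) if p > 0)


def constant_temps(weights):
    """u(C_j) = 1 for the constant pseudotemperature u(omega) = 1."""
    return [1.0 for _ in weights]


def linear_temps(weights):
    """u(C_j) = 2 W_j + w_j for u(omega) = 2 * omega / a and contiguous intervals."""
    temps, cumulative = [], 0.0
    for w in weights:
        temps.append(2 * cumulative + w)
        cumulative += w
    return temps


weights = [0.25, 0.75]
print(difficulty(weights, constant_temps(weights)))  # Shannon entropy in bits
print(difficulty(weights, linear_temps(weights)))
```

With constant unit pseudotemperature the functional reduces to the Shannon entropy of the partition, which matches the first units convention discussed at the end of this appendix.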



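If $\tilde{\mathbf{C}}$ is an arbitrary refinement [37] of $\mathbf{C}$ then the difficulty of the more detailed question $\tilde{\mathbf{C}}$ can
be decomposed as ([2])

$$ G(\Omega, \tilde{\mathbf{C}}, P) = G(\Omega, \mathbf{C}, P) + G(\Omega, \tilde{\mathbf{C}}_{\mathbf{C}}, P), \qquad (A1) $$

where the expected residual difficulty of $\tilde{\mathbf{C}}$ given a perfect answer to $\mathbf{C}$ is defined as

$$ G(\Omega, \tilde{\mathbf{C}}_{\mathbf{C}}, P) = \sum_{C \in \mathbf{C}} P(C)\, G(C, \tilde{\mathbf{C}}_C, P_C). \qquad (A2) $$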

2. Answers and their depth

Given a question $\mathbf{C}$ on $\Omega$, an answer to $\mathbf{C}$ was defined in [3] to be a message $V(\mathbf{C})$ taking values
in the set $\{s_1, \dots, s_m\}$ such that the reception of the value $s_k$ modifies (updates) the initial measure
$P$ on $\Omega$ to the measure $P^k$ such that $P^k_{C_j} = P_{C_j}$ (whenever conditional measures are defined) for
$k = 1, \dots, m$ and $j = 1, \dots, r$. The latter condition ensures that the answer $V(\mathbf{C})$ is indeed an
answer to the question $\mathbf{C}$ (and no more).

It follows from the above definition that, for $V(\mathbf{C})$ to be an answer to a complete question $\mathbf{C}$,
it is necessary and sufficient for the updated measures $P^k$, $k = 1, \dots, m$, to take the form

$$ P^k = \sum_{j=1}^{r} p_{kj} P_{C_j}, \qquad (A3) $$

where $p_{kj}$, $k = 1, \dots, m$, $j = 1, \dots, r$ are nonnegative coefficients such that $\sum_{j=1}^{r} p_{kj} = 1$ for $k =
1, \dots, m$. The expression (A3) is modified somewhat [3] for incomplete questions. The probability
of an answer $V(\mathbf{C})$ taking value $s_k$ is denoted by $v_k$. It is assumed that the updated measures $P^k$,
$k = 1, \dots, m$, are consistent with the original measure $P$ in the sense that

$$ \sum_{k=1}^{m} v_k P^k = P. \qquad (A4) $$

Informally speaking, the condition (A4) means that the original measure $P$ is a "valid" one which
is only "refined" by the information source's answers.

The answer depth functional $Y(\Omega, \mathbf{C}, P, V(\mathbf{C}))$ for the answer $V(\mathbf{C})$ to question $\mathbf{C}$ measures the
amount of pseudoenergy that is conveyed by $V(\mathbf{C})$ in response to question $\mathbf{C}$. The general form of
$Y(\Omega, \mathbf{C}, P, V(\mathbf{C}))$ can be established if certain requirements it has to satisfy are imposed. This was
done in [3], where postulates expressing linearity and isotropy properties were formulated. Under
these conditions, the following result was obtained.

Theorem 2 The answer depth functional $Y(\Omega, \mathbf{C}, P, V(\mathbf{C}))$ has the form

$$ Y(\Omega, \mathbf{C}, P, V(\mathbf{C})) = \sum_{k=1}^{m} v_k \sum_{j=1}^{r} u(C_j) P^k(C_j) \log \frac{P^k(C_j)}{P(C_j)}, $$

where $P^k$ is the measure on $\Omega$ updated by the reception of $V(\mathbf{C}) = s_k$ and $u(C_j) = \frac{1}{P(C_j)} \int_{C_j} u(\omega)\, dP(\omega)$
and the function $u: \Omega \to \mathbb{R}$ is the same function that is used in the question difficulty functional
$G(\Omega, \mathbf{C}, P)$.

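It can be shown (see [3] for details) that if $V(\mathbf{C})$ is any answer to the question $\mathbf{C}$ then
$Y(\Omega, \mathbf{C}, P, V(\mathbf{C})) \le G(\Omega, \mathbf{C}, P)$ with equality if and only if the answer $V(\mathbf{C})$ is perfect, i.e.
$P^j = P_{C_j}$ for $j = 1, \dots, r$. The difficulty of question $\mathbf{C}$ can be written as

$$ G(\Omega, \mathbf{C}, P) = Y(\Omega, \mathbf{C}, P, V(\mathbf{C})) + G(\Omega, \mathbf{C}, P_{V(\mathbf{C})}), \qquad (A5) $$

where

$$ G(\Omega, \mathbf{C}, P_{V(\mathbf{C})}) = \sum_{k=1}^{m} \sum_{j=1}^{r} v_k\, u(C_j) P^k(C_j) \log \frac{1}{P^k(C_j)} \qquad (A6) $$

can be termed the residual difficulty of $\mathbf{C}$ given the answer $V(\mathbf{C})$. Clearly, $G(\Omega, \mathbf{C}, P_{V(\mathbf{C})}) \ge 0$
with the inequality being tight for a perfect answer $V^*(\mathbf{C})$. The residual difficulty $G(\Omega, \mathbf{C}, P_{V(\mathbf{C})})$
can be expressed via coefficients $p_{kj}$ that describe the answer $V(\mathbf{C})$:

$$ G(\Omega, \mathbf{C}, P_{V(\mathbf{C})}) = \sum_{k=1}^{m} \sum_{j=1}^{r} v_k\, p_{kj}\, u(C_j) \log \frac{1}{p_{kj}}. \qquad (A7) $$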

It turns out to be convenient to consider the class of imperfect answers for which the degree
of imperfection is described by a single error probability $\alpha$ - the quasi-perfect answers. For a
quasi-perfect answer $V_\alpha(\mathbf{C})$ to a (complete) question $\mathbf{C} = \{C_1, \dots, C_r\}$, the coefficients $p_{kj}$ have
the form

$$ p_{kj} = (1 - \alpha)\delta_{kj} + \alpha P(C_j), \qquad (A8) $$

for $k = 1, \dots, r$ and $j = 1, \dots, r$, and the updated measure $P^k$ is simply

$$ P^k = \alpha P + (1 - \alpha) P_{C_k} \qquad (A9) $$

for $k = 1, \dots, r$. Clearly, for $\alpha = 0$ a quasi-perfect answer to $\mathbf{C}$ becomes a perfect one. It can be
shown (see [3]) that the answer depth functional for a quasi-perfect answer $V_\alpha(\mathbf{C})$ to question $\mathbf{C}$
can be written as

$$ Y(\Omega, \mathbf{C}, P, V_\alpha(\mathbf{C})) = \sum_{k=1}^{r} u(C_k) P(C_k) (1 - \alpha + \alpha P(C_k)) \log \frac{1 - \alpha + \alpha P(C_k)}{P(C_k)} + \alpha \log \alpha \sum_{k=1}^{r} u(C_k) P(C_k) (1 - P(C_k)), \qquad (A10) $$

which is easily seen to reduce to $G(\Omega, \mathbf{C}, P)$ for $\alpha = 0$ and vanish for $\alpha = 1$.

3. Information source models
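The closed form (A10) can be checked against the decomposition (A5), i.e. answer depth computed as question difficulty minus the residual difficulty (A7). The sketch below assumes base-2 logarithms and uses $v_k = P(C_k)$, which is what the consistency condition (A4) implies for the measures (A9); all function names are illustrative.

```python
import math


def difficulty(probs, temps, base=2.0):
    """Theorem 1: G = sum_j u(C_j) P(C_j) log(1 / P(C_j))."""
    return sum(u * p * math.log(1.0 / p, base)
               for p, u in zip(probs, temps) if p > 0)


def residual_difficulty(v, coeffs, temps, base=2.0):
    """(A7): sum_k sum_j v_k p_kj u(C_j) log(1 / p_kj)."""
    total = 0.0
    for v_k, row in zip(v, coeffs):
        for p_kj, u in zip(row, temps):
            if p_kj > 0:
                total += v_k * p_kj * u * math.log(1.0 / p_kj, base)
    return total


def quasi_perfect_coeffs(alpha, probs):
    """(A8): p_kj = (1 - alpha) * delta_kj + alpha * P(C_j)."""
    r = len(probs)
    return [[(1 - alpha) * (1.0 if k == j else 0.0) + alpha * probs[j]
             for j in range(r)] for k in range(r)]


def depth_via_a5(alpha, probs, temps, base=2.0):
    """Y = G - residual difficulty, with v_k = P(C_k) (assumed from (A4))."""
    coeffs = quasi_perfect_coeffs(alpha, probs)
    return (difficulty(probs, temps, base)
            - residual_difficulty(probs, coeffs, temps, base))


def depth_a10(alpha, probs, temps, base=2.0):
    """Closed form (A10) for a quasi-perfect answer."""
    first = sum(u * p * (1 - alpha + alpha * p)
                * math.log((1 - alpha + alpha * p) / p, base)
                for p, u in zip(probs, temps))
    spread = sum(u * p * (1 - p) for p, u in zip(probs, temps))
    second = alpha * math.log(alpha, base) * spread if alpha > 0 else 0.0
    return first + second


probs, temps = [0.25, 0.75], [1.0, 1.0]
for alpha in (0.0, 0.3, 0.7, 1.0):
    print(alpha, depth_a10(alpha, probs, temps), depth_via_a5(alpha, probs, temps))
```

At $\alpha = 0$ both routes return the question difficulty and at $\alpha = 1$ both return zero, in line with the remark following (A10).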



The pseudotemperature function $u(\cdot)$ on the parameter space characterizes (under the linear
isotropic model considered here) the source-specific relative difficulty of questions "located" in
various regions of $\Omega$. An information source model relates the value of answer depth to the difficulty
of the corresponding question. Formally speaking, the existence of information source models is
based on the following hypothesis [3].

Hypothesis SI. For the given information source and any question C, the answer depth is a 
function of the question difficulty: 

$$ Y(\Omega, \mathbf{C}, P, V(\mathbf{C})) = h(G(\Omega, \mathbf{C}, P)), $$

where $h: \mathbb{R}_+ \to \mathbb{R}_+$ is a function of a single argument.

The simplest information source model considered in [3] is the simple capacity model given by

$$ h(x) = \begin{cases} x & \text{if } x \le Y_s, \\ Y_s & \text{if } x > Y_s, \end{cases} \qquad (A11) $$

which is fully characterized by the single parameter $Y_s$ which has the meaning of the information
source capacity.

The most apparent drawback of model (A11) is that it predicts that the source would provide
a perfect answer to any question whose difficulty does not exceed the source capacity. The linear
modified capacity model described by

$$ h(x) = \begin{cases} bx & \text{if } x \le Y_s/b, \\ Y_s & \text{if } x > Y_s/b, \end{cases} \qquad (A12) $$

removes this drawback at the expense of one extra parameter $b < 1$ that has to be estimated.
Several slightly more complicated models were proposed in [3].
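Both capacity models are simple enough to state in a few lines. The sketch below is illustrative only: the function names are not from the paper, and allowing $b = 1$ (in which case the modified model coincides with the simple one) is a convenience assumption.

```python
def simple_capacity(x, Y_s):
    """(A11): answer depth equals difficulty up to the capacity Y_s."""
    return min(x, Y_s)


def modified_linear_capacity(x, Y_s, b):
    """(A12): answer depth is b * x until it saturates at Y_s (at x = Y_s / b)."""
    if not 0 < b <= 1:
        raise ValueError("b is expected to lie in (0, 1]")
    return min(b * x, Y_s)


# Values used in the inventory example: Y_s = 0.2, b = 0.8.
Y_s, b = 0.2, 0.8
for x in (0.1, Y_s / b, 0.41, 0.80):
    print(x, simple_capacity(x, Y_s), modified_linear_capacity(x, Y_s, b))
```

For these values the saturation threshold is $Y_s/b = 0.25$, so questions of difficulty 0.41 and 0.80 both receive answers of depth equal to the capacity 0.2, as stated in the inventory example.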



The values of model parameters as well as pseudotemperature functions for information sources
can be estimated from the observed sources' performance on some set of sample questions. Optimization
based formulations for such estimation were also proposed in [3].

It is easy to see that multiplying the pseudotemperature function $u(\cdot)$ by a constant has the
effect of multiplying both the question difficulty and the answer depth by the same constant and is
equivalent to a choice of units of pseudoenergy. It turns out to be convenient to use two different
conventions in this regard.

• The convention in which $\int_\Omega u(\omega)\, dP(\omega) = 1$ (unit average pseudotemperature). Here the units of pseudoenergy are chosen in such
a way that, for constant $u(\omega)$, the pseudoenergy coincides with entropy making it convenient
to make use of the standard intuition about entropy and information.

• The convention in which each source has unit capacity ($Y_s = 1$). This choice of units of pseudoenergy
makes it convenient to compare the "depth of knowledge" of different information
sources to each other by directly comparing their respective pseudotemperature values at
the same points of the parameter space.



Appendix B: Proofs 



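1. Proof of Lemma [2]

$$ \begin{aligned}
L(g, P) &= \int_\Omega \left( f(\omega, g(\omega)) - f(\omega, x^*_\omega) \right) P(d\omega) \\
&= \sum_{j=1}^{r} P(C_j) \int_{C_j} \frac{1}{P(C_j)} \left( f(\omega, g(\omega)) - f(\omega, x^*_\omega) \right) P(d\omega) \\
&\overset{(a)}{=} \sum_{j=1}^{r} P(C_j) L(g, P_{C_j}) \overset{(b)}{=} L(g, P_{\mathbf{C}}),
\end{aligned} $$

where (a) follows directly from the definition of the expected loss for the measure $P_{C_j}$ and (b)
follows from the definition ([9]) of $L(g, P_{\mathbf{C}})$.

2. Proof of Lemma [3]

We have

$$ \begin{aligned}
L(g_P, P) &= \int_\Omega \left( f(\omega, x^*_P) - f(\omega, x^*_\omega) \right) P(d\omega) \\
&= \int_\Omega \left( f(\omega, x^*_P) - f(\omega, x^*_\omega) + f(\omega, g_{\mathbf{C},P}(\omega)) - f(\omega, g_{\mathbf{C},P}(\omega)) \right) P(d\omega) \\
&= \int_\Omega \left( f(\omega, x^*_P) - f(\omega, g_{\mathbf{C},P}(\omega)) \right) P(d\omega) + \int_\Omega \left( f(\omega, g_{\mathbf{C},P}(\omega)) - f(\omega, x^*_\omega) \right) P(d\omega) \\
&= \sum_{j=1}^{r} P(C_j) \int_{C_j} \frac{1}{P(C_j)} \left( f(\omega, x^*_P) - f(\omega, g_{\mathbf{C},P}(\omega)) \right) P(d\omega) \\
&\quad + \sum_{j=1}^{r} P(C_j) \int_{C_j} \frac{1}{P(C_j)} \left( f(\omega, g_{\mathbf{C},P}(\omega)) - f(\omega, x^*_\omega) \right) P(d\omega) \\
&\overset{(a)}{=} \sum_{j=1}^{r} P(C_j) \int_{C_j} \left( f(\omega, x^*_P) - f(\omega, x^*_{P_{C_j}}) \right) P_{C_j}(d\omega) \\
&\quad + \sum_{j=1}^{r} P(C_j) \int_{C_j} \left( f(\omega, g_{\mathbf{C},P}(\omega)) - f(\omega, x^*_\omega) \right) P_{C_j}(d\omega) \\
&\overset{(b)}{=} \sum_{j=1}^{r} P(C_j) S(x^*_P, P_{C_j}) + \sum_{j=1}^{r} P(C_j) L(g_{\mathbf{C},P}, P_{C_j}) \\
&\overset{(c)}{=} S(x^*_P, P_{\mathbf{C}}) + L(g_{\mathbf{C},P}, P_{\mathbf{C}}) \overset{(d)}{=} S(x^*_P, P_{\mathbf{C}}) + L(g_{\mathbf{C},P}, P),
\end{aligned} $$

where (a) follows from the definition of the conditional measure $P_{C_j}$ (together with $g_{\mathbf{C},P}(\omega) = x^*_{P_{C_j}}$ for $\omega \in C_j$), (b) follows from the
definitions of $S(x^*_P, P_{C_j})$ and $L(g, P_{C_j})$, (c) follows from the notational convention ([8]) for functionals of
measures, and (d) follows from Lemma [2].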



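3. Proof of Proposition [2]

For the given value $s_j$ of the answer, $P^j = P_{C_j}$, $j = 1, \dots, r$. Therefore the agent can achieve
the smallest possible loss by choosing the solution $x^*_{P_{C_j}}$. The resulting conditional loss will be

$$ L(s_j) = \int_{C_j} \left( f(\omega, x^*_{P_{C_j}}) - f(\omega, x^*_\omega) \right) dP_{C_j}(\omega). \qquad (B1) $$

Taking the expectation of (B1) over possible values of the answer $V^*(\mathbf{C})$ we obtain

$$ \begin{aligned}
\mathcal{L}(V^*(\mathbf{C})) &= \sum_{j=1}^{r} \Pr(V^*(\mathbf{C}) = s_j) L(s_j) \overset{(a)}{=} \sum_{j=1}^{r} P(C_j) \int_{C_j} \left( f(\omega, x^*_{P_{C_j}}) - f(\omega, x^*_\omega) \right) P_{C_j}(d\omega) \\
&\overset{(b)}{=} \sum_{j=1}^{r} P(C_j) \int_{C_j} \left( f(\omega, g_{\mathbf{C},P}(\omega)) - f(\omega, x^*_\omega) \right) P_{C_j}(d\omega) \\
&\overset{(c)}{=} \sum_{j=1}^{r} P(C_j) L(g_{\mathbf{C},P}, P_{C_j}) = L(g_{\mathbf{C},P}, P_{\mathbf{C}}) \overset{(d)}{=} L(g_{\mathbf{C},P}, P),
\end{aligned} $$

where (a) follows from the fact that, for a perfect answer consistent with the original measure, $\Pr(V^*(\mathbf{C}) =
s_j) = P(C_j)$, (b) follows from the fact that the map $g_{\mathbf{C},P}$ is subset-optimal, (c) follows from the definition
of the expected loss for the measures $P_{C_j}$ and $P_{\mathbf{C}}$, and (d) follows from Lemma [2].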

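4. Proof of Lemma [4]

$$ \begin{aligned}
S(x^*_P, P_{\mathbf{C}}) &= \sum_{j=1}^{r} P(C_j) \int_{C_j} \left( f(\omega, x^*_P) - f(\omega, g_{\mathbf{C},P}(\omega)) \right) P_{C_j}(d\omega) \\
&\overset{(a)}{=} \sum_{j=1}^{r} \sum_{k=1}^{r} p_{kj} v_k \int_{C_j} \left( f(\omega, x^*_P) - f(\omega, g_{\mathbf{C},P}(\omega)) \right) P_{C_j}(d\omega) \\
&\overset{(b)}{=} \sum_{j=1}^{r} \sum_{k=1}^{r} p_{kj} v_k \int_{\Omega} \left( f(\omega, x^*_P) - f(\omega, g_{\mathbf{C},P}(\omega)) \right) P_{C_j}(d\omega) \\
&= \sum_{k=1}^{r} v_k \int_{\Omega} \left( f(\omega, x^*_P) - f(\omega, g_{\mathbf{C},P}(\omega)) \right) \sum_{j=1}^{r} p_{kj} P_{C_j}(d\omega) \\
&\overset{(c)}{=} \sum_{k=1}^{r} v_k \int_{\Omega} \left( f(\omega, x^*_P) - f(\omega, g_{\mathbf{C},P}(\omega)) \right) P^k(d\omega) \\
&= \sum_{k=1}^{r} v_k \int_{\Omega} \left( f(\omega, x^*_P) - f(\omega, g_{\mathbf{C},P}(\omega)) + f(\omega, x^*_{P^k}) - f(\omega, x^*_{P^k}) \right) P^k(d\omega) \\
&= \sum_{k=1}^{r} v_k \int_{\Omega} \left( f(\omega, x^*_P) - f(\omega, x^*_{P^k}) \right) P^k(d\omega) + \sum_{k=1}^{r} v_k \int_{\Omega} \left( f(\omega, x^*_{P^k}) - f(\omega, g_{\mathbf{C},P}(\omega)) \right) P^k(d\omega) \\
&\overset{(d)}{=} \sum_{k=1}^{r} v_k S(x^*_P, P^k) + \sum_{k=1}^{r} v_k B(g_{\mathbf{C},P}, P^k) \\
&\overset{(e)}{=} S(x^*_P, P_{V(\mathbf{C})}) + B(g_{\mathbf{C},P}, P_{V(\mathbf{C})}),
\end{aligned} $$

where (a) follows from the consistency condition (A4), (b) follows from the fact that the measure $P_{C_j}$
vanishes outside of $C_j$, (c) follows from the form (A3) of the updated measures $P^k$, (d) follows
from the definitions of suboptimality and gain, and (e) follows from the definitions of
$S(x^*_P, P_{V(\mathbf{C})})$ and $B(g_{\mathbf{C},P}, P_{V(\mathbf{C})})$ as expectations over the values of the answer.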

5. Proof of Proposition [3]

The value $s_k$ of answer $V(\mathbf{C})$ implies that the measure on $\Omega$ is equal to $P^k$. Therefore
the agent can achieve minimum loss by using the stochastic optimal solution $x^*_{P^k}$. The resulting
minimum loss will be

$$ L(s_k) = L(g_{P^k}, P^k), \qquad (B2) $$

where $g_{P^k}$ is the all-to-one map $g_{P^k}(\omega) = x^*_{P^k}$ for all $\omega \in \Omega$.

The minimum expected loss $\mathcal{L}(V(\mathbf{C}))$ can be obtained by taking the expectation of (B2) over the values of the answer:

$$ \mathcal{L}(V(\mathbf{C})) = \sum_{k=1}^{m} v_k L(g_{P^k}, P^k). \qquad (B3) $$

On the other hand, we can decompose the EVPI $L(g_P, P)$ as follows.

$$ \begin{aligned}
L(g_P, P) &= \int_\Omega \left( f(\omega, x^*_P) - f(\omega, x^*_\omega) \right) P(d\omega) \\
&= \sum_{k=1}^{m} v_k \int_\Omega \left( f(\omega, x^*_P) - f(\omega, x^*_\omega) \right) P^k(d\omega) \\
&= \sum_{k=1}^{m} v_k \int_\Omega \left( f(\omega, x^*_P) - f(\omega, x^*_\omega) + f(\omega, x^*_{P^k}) - f(\omega, x^*_{P^k}) \right) P^k(d\omega) \\
&= \sum_{k=1}^{m} v_k \int_\Omega \left( f(\omega, x^*_P) - f(\omega, x^*_{P^k}) \right) P^k(d\omega) + \sum_{k=1}^{m} v_k \int_\Omega \left( f(\omega, x^*_{P^k}) - f(\omega, x^*_\omega) \right) P^k(d\omega) \\
&= \sum_{k=1}^{m} v_k S(x^*_P, P^k) + \sum_{k=1}^{m} v_k L(g_{P^k}, P^k) = S(x^*_P, P_{V(\mathbf{C})}) + \sum_{k=1}^{m} v_k L(g_{P^k}, P^k). \qquad (B4)
\end{aligned} $$

Comparing (B3) with (B4) we obtain

$$ \mathcal{L}(V(\mathbf{C})) = L(g_P, P) - S(x^*_P, P_{V(\mathbf{C})}). \qquad (B5) $$

Finally, using the decomposition of EVPI of Corollary [2] in (B5) yields

$$ \mathcal{L}(V(\mathbf{C})) = B(g_{\mathbf{C},P}, P_{V(\mathbf{C})}) + L(g_{\mathbf{C},P}, P). $$

Appendix C: Examples of maps

Let $\Omega$ be the interval $[0, a]$ and let $X$ be the real line $\mathbb{R}$. Let the integrand $f(\omega, x)$ have the
following form: $f(\omega, x) = (x - \omega)^2$ and let the original measure $P$ be the uniform continuous
distribution on $[0, a]$.

It is obvious that the optimal solution for the given realization $\omega$ is simply $x^*_\omega = \omega$. The
stochastic optimal map is $g_P(\omega) = \frac{a}{2} \in X$ for all $\omega \in \Omega$. Therefore the EVPI of the problem ([1]) is

$$ L(g_P, P) = \frac{1}{a} \int_0^a \left( (x^*_P - \omega)^2 - (x^*_\omega - \omega)^2 \right) d\omega = \frac{1}{a} \int_0^a \left( \frac{a}{2} - \omega \right)^2 d\omega = \frac{a^2}{12}. $$

Let $\mathbf{C} = \{[0, \frac{a}{2}), [\frac{a}{2}, a]\}$ and $\mathbf{C}' = \{[0, \frac{a}{4}) \cup [\frac{a}{2}, \frac{3a}{4}), [\frac{a}{4}, \frac{a}{2}) \cup [\frac{3a}{4}, a]\}$ be two $r = 2$ partitions
of $\Omega$. Let us consider several different $r = 2$ maps $g$ (see Fig. [5] for an illustration).

• $g_1 = (\mathbf{C}, \{\frac{a}{4}, \frac{3a}{4}\}) = g_{\mathbf{C},P}$. The measures $P_{C_1}$ and $P_{C_2}$ are uniform on $C_1$ and $C_2$ respectively.
We have $x^*_{P_{C_1}} = \frac{a}{4}$ and $x^*_{P_{C_2}} = \frac{3a}{4}$. Thus $g_1$ is subset-optimal. Note that in this case $g_1$ also
lies on the efficient frontier in the $(G, L)$ coordinate plane (see Fig. [6] for an illustration).

• $g_2 = (\mathbf{C}, \{0, a\})$. For this map, the partition is the same as that for $g_1$, but the image set is
different. This map is therefore not subset-optimal.

• $g_3 = (\mathbf{C}', \{\frac{3a}{8}, \frac{5a}{8}\}) = g_{\mathbf{C}',P}$. For this map's partition both subsets $C'_1$ and $C'_2$ consist of two
connected components. It is easy to check that $x^*_{P_{C'_1}} = \frac{3a}{8}$ and $x^*_{P_{C'_2}} = \frac{5a}{8}$ and thus $g_3$ is subset-optimal.

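The loss for these three maps can be found as follows. For $g_1$,

$$ L(g_1, P) = \frac{1}{2} \cdot \frac{2}{a} \int_0^{a/2} \left( \frac{a}{4} - \omega \right)^2 d\omega + \frac{1}{2} \cdot \frac{2}{a} \int_{a/2}^{a} \left( \frac{3a}{4} - \omega \right)^2 d\omega = \frac{a^2}{48}, $$

for $g_2$,

$$ L(g_2, P) = \frac{1}{2} \cdot \frac{2}{a} \int_0^{a/2} (0 - \omega)^2 d\omega + \frac{1}{2} \cdot \frac{2}{a} \int_{a/2}^{a} (a - \omega)^2 d\omega = \frac{a^2}{12}, $$

and for $g_3$,

$$ L(g_3, P) = \frac{1}{2} \cdot \frac{2}{a} \left( \int_0^{a/4} \left( \frac{3a}{8} - \omega \right)^2 d\omega + \int_{a/2}^{3a/4} \left( \frac{3a}{8} - \omega \right)^2 d\omega \right) + \frac{1}{2} \cdot \frac{2}{a} \left( \int_{a/4}^{a/2} \left( \frac{5a}{8} - \omega \right)^2 d\omega + \int_{3a/4}^{a} \left( \frac{5a}{8} - \omega \right)^2 d\omega \right) = \frac{13a^2}{192}. $$

Fig. [6] shows the efficient frontier and maps $g_1$, $g_2$ and $g_3$ in the $(G, L)$ coordinate plane. We see
that $g_1$ lies on the efficient frontier while $g_2$ and $g_3$ are located above it.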
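The three loss values can be verified with exact rational arithmetic. The sketch below sets $a = 1$, so all results are in units of $a^2$, and represents each map as a list of (subset, image point) pairs; this representation and the function names are illustrative assumptions rather than the paper's notation.

```python
from fractions import Fraction as Fr


def squared_loss(pieces, x):
    """Integral of (x - w)^2 dw over a union of intervals, for a = 1 (density 1)."""
    total = Fr(0)
    for lo, hi in pieces:
        total += ((hi - x) ** 3 - (lo - x) ** 3) / 3
    return total


def map_loss(g):
    """L(g, P) = sum over subsets of the integral of (g(w) - w)^2; x*_w = w costs zero."""
    return sum(squared_loss(pieces, x) for pieces, x in g)


g1 = [([(Fr(0), Fr(1, 2))], Fr(1, 4)), ([(Fr(1, 2), Fr(1))], Fr(3, 4))]
g2 = [([(Fr(0), Fr(1, 2))], Fr(0)), ([(Fr(1, 2), Fr(1))], Fr(1))]
g3 = [([(Fr(0), Fr(1, 4)), (Fr(1, 2), Fr(3, 4))], Fr(3, 8)),
      ([(Fr(1, 4), Fr(1, 2)), (Fr(3, 4), Fr(1))], Fr(5, 8))]

loss_g1, loss_g2, loss_g3 = map_loss(g1), map_loss(g2), map_loss(g3)
evpi = squared_loss([(Fr(0), Fr(1))], Fr(1, 2))
print(loss_g1, loss_g2, loss_g3, evpi)  # 1/48 1/12 13/192 1/12
```

The loss of $g_2$ coincides with the EVPI $a^2/12$: its partition carries information, but the poorly chosen image points waste it.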



FIG. 5. Maps $g_1$, $g_2$ and $g_3$. The partitions for $g_1$ and $g_2$ consist of connected sets only. Each element of
the partition for $g_3$ consists of two connected sets.

FIG. 6. Maps $g_1$, $g_2$ and $g_3$ on the $(G, L)$ coordinate plane (pseudoenergy on the horizontal axis, loss on the
vertical axis). All possible maps for this problem lie in the shaded region, at or above the efficient frontier.

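Since $g_1$ and $g_3$ are subset-optimal we have (as Lemma [3] states) $S(x^*_P, P_{\mathbf{C}}) = \frac{a^2}{12} - \frac{a^2}{48} = \frac{a^2}{16}$ for $g_1$ and $S(x^*_P, P_{\mathbf{C}'}) =
\frac{a^2}{12} - \frac{13a^2}{192} = \frac{a^2}{64}$ for $g_3$. For $g_2$ the suboptimality is the same as that for $g_1$. Note that, since $g_2$ is not subset-optimal,
$S(x^*_P, P_{\mathbf{C}}) + L(g_2, P) \neq L(g_P, P)$.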



For this one-dimensional example it turns out to be straightforward to find maps on the efficient
frontier. Indeed, it is obvious that partitions for such maps have to consist of connected sets only.
It is also clear that the order in which subsets $C_j$ appear on the interval $[0, a]$ does not affect the loss
because the integrand $f(\omega, x)$ depends on $|\omega - x|$ only. So, for the fixed value of $r$, any map
$g$ that can lie on the efficient frontier can be uniquely characterized by the subset measures
$w_j = P(C_j)$, $j = 1, \dots, r$. Given the values $w_j$, the expected loss of the corresponding map can be
written as

$$ L(g, P) = \sum_{j=1}^{r} w_j \frac{(a w_j)^2}{12} = \frac{a^2}{12} \sum_{j=1}^{r} w_j^3 . $$

In order to find the optimal values of $w_j$ yielding the smallest loss for the question difficulty
$G(\Omega, \mathbf{C}, P)$ not exceeding $h$ the following optimization problem needs to be solved.

$$ \begin{aligned}
\text{minimize} \quad & \sum_{j=1}^{r} w_j^3 \\
\text{subject to} \quad & -\sum_{j=1}^{r} u(C_j) w_j \log w_j \le h \\
& \sum_{j=1}^{r} w_j = 1 \\
& w_j \ge 0, \quad j = 1, \dots, r,
\end{aligned} \qquad (C1) $$

where $u(C_j)$ is the pseudotemperature of subset $C_j$ and $h$ is a nonnegative parameter. Since the
function $-\sum_{j=1}^{r} u(C_j) w_j \log w_j$ is concave, (C1) is a global optimization problem. However it can
easily be solved to optimality for moderate values of the partition size $r$. We consider two cases:
constant pseudotemperature function $u(\omega) = 1$ and linear pseudotemperature $u(\omega) = \frac{2}{a}\omega$. We can
assume that $C_j = [aW_j, a(W_j + w_j)]$. In the former case, $u(C_j) = 1$, $j = 1, \dots, r$ and in the latter
case,

$$ u(C_j) = 2W_j + w_j , \qquad (C2) $$

where $W_j = \sum_{i=1}^{j-1} w_i$ if $j > 1$ and $W_1 = 0$.

The resulting efficient frontier is shown in Fig. [7].
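For $r = 2$ the problem (C1) has a single free variable, so a plain grid search is enough to trace the frontier. The sketch below is not the solution method used for Fig. 7, which the text does not specify; the grid resolution, the base-2 logarithm and the function names are all assumptions made for illustration.

```python
import math


def constant_temps(w):
    """u(C_j) = 1 for the constant pseudotemperature."""
    return [1.0 for _ in w]


def linear_temps(w):
    """(C2): u(C_j) = 2 W_j + w_j for u(omega) = 2 * omega / a."""
    temps, cumulative = [], 0.0
    for w_j in w:
        temps.append(2 * cumulative + w_j)
        cumulative += w_j
    return temps


def solve_c1_r2(h, temps_fn, n=2000, base=2.0):
    """Grid search over w_1 in (0, 1); returns (objective, w_1, difficulty) or None."""
    best = None
    for i in range(1, n):
        w = (i / n, 1 - i / n)
        u = temps_fn(w)
        G = sum(u_j * w_j * math.log(1 / w_j, base) for u_j, w_j in zip(u, w))
        if G <= h + 1e-12:
            objective = sum(w_j ** 3 for w_j in w)
            if best is None or objective < best[0]:
                best = (objective, w[0], G)
    return best


# Loss on the frontier is (a^2 / 12) * objective.
for h in (0.25, 0.5, 1.0):
    print(h, solve_c1_r2(h, constant_temps), solve_c1_r2(h, linear_temps))
```

With constant pseudotemperature and $h = 1$ bit the search returns the balanced partition $w_1 = 0.5$ with objective $0.25$, i.e. loss $a^2/48$, which is the map $g_1$ of the previous example. The function returns `None` when no grid point is feasible, which happens only for very small $h$.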

FIG. 7. Efficient frontier for the toy example: constant pseudotemperature case (dotted line) and linear
pseudotemperature case (solid line).

Let us now consider imperfect answers to questions $\mathbf{C}$ in the same example. For simplicity, we
set $r = 2$ for questions and assume the pseudotemperature to be constant on $\Omega$. We also assume
all answers to be quasi-perfect so that the updated measures $P^k$, $k = 1, 2$ have the form (A9).

The stochastic optimal solutions $x^*_{P^k}$ for measures $P^k$ can be found as

$$ x^*_{P^k} = \arg\min_x \int_\Omega (x - \omega)^2 P^k(d\omega). $$

We have

$$ x^*_{P^1} = \arg\min_x \left[ \frac{1 - \alpha(1 - w_1)}{w_1 a} \int_0^{w_1 a} (x - \omega)^2 d\omega + \frac{\alpha}{a} \int_{w_1 a}^{a} (x - \omega)^2 d\omega \right] = \frac{1}{2}\left( w_1 a + \alpha(1 - w_1) a \right) = \frac{1}{2} a (w_1 + \alpha w_2) $$

and, analogously (measuring from the right endpoint of the interval, since $C_2 = [w_1 a, a]$),

$$ x^*_{P^2} = a - \frac{1}{2} a (w_2 + \alpha w_1). $$



We can now find the suboptimalities:

$$ S(x^*_P, P^1) = \int_\Omega \left( f(\omega, x^*_P) - f(\omega, x^*_{P^1}) \right) P^1(d\omega) = \frac{a^2}{12} \left( (3 - 6w_1 + 3w_1^2)(1 + \alpha^2) + \alpha(-6 + 12w_1 - 6w_1^2) \right) $$

and, analogously,

$$ S(x^*_P, P^2) = \frac{a^2}{12} \left( (3 - 6w_2 + 3w_2^2)(1 + \alpha^2) + \alpha(-6 + 12w_2 - 6w_2^2) \right). $$

The suboptimality $S(x^*_P, P_{V(\mathbf{C})})$ is then

$$ S(x^*_P, P_{V(\mathbf{C})}) = w_1 S(x^*_P, P^1) + w_2 S(x^*_P, P^2) = \frac{a^2}{12} (1 - w_1^3 - w_2^3)(1 - \alpha)^2 . $$

FIG. 8. Dependence of the expected loss on the added information for $r = 2$ partitions. The solid curve
corresponds to the error-free message case with $w_1$ varying from 0 to 0.5. The dashed line shows the
$w_1 = w_2 = 0.5$ case with $\alpha$ varying from 1 to 0 (from left to right on the figure). The dotted line is the
same for the $w_1 = 1 - w_2 = 0.7$ case, and the dash-dotted line is for the $w_1 = 1 - w_2 = 0.9$ case.

The new value of the expected loss is

$$ L(g_P, P) - S(x^*_P, P_{V(\mathbf{C})}) = \frac{a^2}{12} - \frac{a^2}{12} (1 - w_1^3 - w_2^3)(1 - \alpha)^2 . \qquad (C3) $$

Note that for $\alpha = 0$ we recover the expression $L(g_{\mathbf{C},P}, P) = \frac{a^2}{12}(w_1^3 + w_2^3)$ for a perfect answer
and for $\alpha = 1$ the new value of the loss is simply $L(g_P, P) = \frac{a^2}{12}$ since $\alpha = 1$ describes the case in
which the answer $V(\mathbf{C})$ carries no new information and the updated measure is simply $P$.
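Expression (C3) can be cross-checked by computing the suboptimalities directly. For the squared loss the stochastic optimal solution is the mean of the measure, and the suboptimality of $x$ under a measure with mean $m$ is $(x - m)^2$. The sketch below uses this fact with the quasi-perfect measures (A9); the function names are illustrative.

```python
def mean_Pk(k, w1, alpha, a):
    """Mean of P^k = alpha * P + (1 - alpha) * P_{C_k}, with C_1 = [0, w1*a], C_2 = [w1*a, a]."""
    subset_mean = w1 * a / 2 if k == 1 else (w1 * a + a) / 2
    return alpha * a / 2 + (1 - alpha) * subset_mean


def loss_direct(w1, alpha, a):
    """EVPI minus sum_k w_k * S(x*_P, P^k), with S = (x*_P - mean of P^k)^2."""
    w2 = 1 - w1
    x_P = a / 2
    subopt = sum(w * (x_P - mean_Pk(k, w1, alpha, a)) ** 2
                 for k, w in ((1, w1), (2, w2)))
    return a ** 2 / 12 - subopt


def loss_c3(w1, alpha, a):
    """Closed form (C3)."""
    w2 = 1 - w1
    return a ** 2 / 12 * (1 - (1 - w1 ** 3 - w2 ** 3) * (1 - alpha) ** 2)


a = 1.0
for w1, alpha in [(0.5, 0.0), (0.5, 0.5), (0.7, 0.3), (0.9, 1.0)]:
    print(w1, alpha, loss_direct(w1, alpha, a), loss_c3(w1, alpha, a))
```

The two computations agree because $1 - w_1^3 - w_2^3 = 3 w_1 w_2$ when $w_1 + w_2 = 1$, so (C3) can equivalently be written as $\frac{a^2}{12} - \frac{a^2}{4} w_1 w_2 (1 - \alpha)^2$.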

Fig. [8] shows the dependence of the expected loss (C3) on answer depth with the error parameter
$\alpha$ ranging from 0 to 1 for several values of subset measures $w_1$ and $w_2$ for the $r = 2$ case. The
part of the efficient frontier that can be achieved for $r = 2$ is also shown (solid bold line). It is
interesting to observe that, for the same amount of pseudoenergy, lower values of the expected loss
can be achieved with imperfect answers to more difficult questions.